
University of Strathclyde Institutional Repository

Genomic distance entrained clustering and regression modelling highlights interacting genomic regions contributing to proliferation in breast cancer

Author: A Bergamaschi
AB Olshen
Alan Mackay
Amar S Ahmad
Anita Grigoriadis
BM Bolstad
BS Everitt
C Desmedt
C Sotiriou
Costas Mitsopoulos
David Sims
E Huang
E Korsching
GN Lance
H Chen
H Dai
J Hicks
JP Brunet
JR Pollack
K Chin
L Shivakumar
LD Miller
M Ignatiadis
M Witcher
Marketa Zvelebil
ML Whitfield
MP Jansen
P Novak
PT Simpson
R Edgar
S Loi
Tim J Dexter
V Raman
Y Pawitan
Y Wang
Publication venue: BioMed Central
Publication date: 01/01/2010

Abstract. Background: Genomic copy number changes and regional alterations in epigenetic states have been linked to grade in breast cancer. However, the relative contribution of specific alterations to the pathology of different breast cancer subtypes remains unclear. The heterogeneity and interplay of genomic and epigenetic variations mean that large datasets and statistical data mining methods are required to uncover recurrent patterns that are likely to be important in cancer progression. Results: We employed ridge regression to model the relationship between regional changes in gene expression and proliferation. Regional features were extracted from tumour gene expression data using a novel clustering method, called genomic distance entrained agglomerative (GDEC) clustering. Using gene expression data in this way provides a simple means of integrating the phenotypic effects of both copy number aberrations and alterations in chromatin state. We show that regional metagenes derived from GDEC clustering are representative of recurrent regions of epigenetic regulation or copy number aberrations in breast cancer. Furthermore, detected patterns of genomic alterations are conserved across independent oestrogen receptor positive breast cancer datasets. Sequential competitive metagene selection was used to reveal the relative importance of genomic regions in predicting proliferation rate. The predictive model suggested additive interactions between the most informative regions such as 8p22-12 and 8q13-22. Conclusions: Data-mining of large-scale microarray gene expression datasets can reveal regional clusters of co-ordinate gene expression, independent of cause. By correlating these clusters with tumour proliferation we have identified a number of genomic regions that act together to promote proliferation in ER+ breast cancer. Identification of such regions should enable prioritisation of genomic regions for combinatorial functional studies to pinpoint the key genes and interactions contributing to tumourigenicity.
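As an illustration of the ridge regression step named in the abstract above, the following is a minimal sketch, not the authors' code. It assumes regional metagene values are already available as numeric columns and that a proliferation score is the target. The function name `ridge_fit`, the penalty `lam` and the toy numbers are hypothetical.

```python
def ridge_fit(X, y, lam):
    """Return weights w minimising ||Xw - y||^2 + lam * ||w||^2.

    X is a list of rows (one row per tumour, one column per metagene);
    y is a list of proliferation scores. Both are illustrative inputs.
    """
    n_features = len(X[0])
    # Normal equations: (X^T X + lam * I) w = X^T y.
    A = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
          for j in range(n_features)] for i in range(n_features)]
    b = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(n_features)]
    # Forward elimination with partial pivoting.
    for col in range(n_features):
        pivot = max(range(col, n_features), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n_features):
            factor = A[r][col] / A[col][col]
            for c in range(col, n_features):
                A[r][c] -= factor * A[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    w = [0.0] * n_features
    for i in reversed(range(n_features)):
        tail = sum(A[i][j] * w[j] for j in range(i + 1, n_features))
        w[i] = (b[i] - tail) / A[i][i]
    return w


# Toy data: two hypothetical metagenes, three tumours.
X_toy = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y_toy = [1.0, 2.0, 3.0]
w_unpenalised = ridge_fit(X_toy, y_toy, 0.0)  # ordinary least squares
w_ridge = ridge_fit(X_toy, y_toy, 1.0)        # coefficients shrink towards zero
```

With a positive penalty, the coefficients are pulled towards zero, which is what keeps the fit stable when regional features are correlated with one another.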

Springer - Publisher Connector

Institute of Cancer Research Repository

King's Research Portal

Predicting Housekeeping Genes Based on Fourier Analysis

Author: AE Vinogradov
AI Su
BM Bolstad
Bo Dong
BR Kim
CD Eller
D Karolchik
E Eisenberg
G Rustici
HJ de Jonge
J Ye
J Zhu
JA Warrington
Jen-Tsan Ashley Chi
KS Pollard
Li Liu
LL Breeden
LL Hsiao
M Ashburner
MJ Lawson
ML Whitfield
Peng Zhang
Runsheng Chen
S Greer
Shunmin He
T Yamada
U de Lichtenberg
X Ge
Xiaowei Chen
Yunfei Wang
Publication venue: Public Library of Science

Housekeeping genes (HKGs) generally have fundamental functions in basic biochemical processes in organisms, and usually have relatively steady expression levels across various tissues. They play an important role in the normalization of microarray technology. Using Fourier analysis, we transformed gene expression time series from a HeLa cell cycle gene expression dataset into Fourier spectra. We then designed an effective computational method for discriminating between HKGs and non-HKGs using the support vector machine (SVM) supervised learning algorithm, which can extract significant features of the spectra and so provides a basis for identifying specific gene expression patterns. Using our method we identified 510 human HKGs, and then validated them by comparison with two independent sets of tissue expression profiles. Results showed that our predicted HKG set is more reliable than three previously identified sets of HKGs.
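The transformation of an expression time series into a Fourier spectrum, as described in the abstract above, can be sketched as follows. This is a minimal illustration using a direct discrete Fourier transform. The function name `amplitude_spectrum` and the two toy series are assumptions, and the SVM classifier that would consume these spectra is not shown.

```python
import cmath
import math


def amplitude_spectrum(series):
    """Return |X_k| / N for k = 0 .. N // 2 of the discrete Fourier transform."""
    n = len(series)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(series))
        spectrum.append(abs(coeff) / n)
    return spectrum


# Hypothetical steady gene: constant expression over eight time points.
steady = [5.0] * 8
# Hypothetical cycling gene: same mean, one oscillation over the series.
cycling = [5.0 + math.cos(2 * math.pi * t / 8) for t in range(8)]

steady_spec = amplitude_spectrum(steady)
cycling_spec = amplitude_spectrum(cycling)
```

A steady series puts all of its amplitude in the zero-frequency term, whereas a cycling series also shows a peak at its oscillation frequency. That difference is the kind of spectral feature a classifier could use.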

arXiv.org e-Print Archive

Wide-Scale Analysis of Human Functional Transcription Factor Binding Reveals a Strong Bias towards the Transcription Start Site

Author: A Ambesi-Impiombato
A Blais
A Eto
A Subramanian
AE Kel
AG Clark
AL Lam
AM McGuire
Anat Reiner
Assif Yitzhaky
B Ren
C Kimura-Yoshida
C Plessy
C Yang
CT Harbison
D Pfeifer
D Wang
DB Allison
E Emberly
E Segal
Eytan Domany
FP Roth
GC Pipes
GC Yuan
GQ Yao
GZ Hertz
H Li
H Lodish
J Zheng
JD Hughes
JL DeRisi
JQ Ling
K Frech
K Quandt
KD MacIsaac
L Amir-Zilberstein
L Elnitski
L Marino-Ramirez
L McCue
M Ashburner
M Kellis
M Milyavsky
MA Nobrega
Mark Koudritsky
MC Frith
ML Howard
ML Whitfield
N Rajewsky
Or Zuk
P Carninci
P Cliften
PM Haverty
PR Buckland
R Elkon
R Liu
R Sharan
Ran Brosh
S Aerts
S Rashi-Elkeles
S Tavazoie
SJ Cooper
SJ Ho Sui
Sui Huang
U Gerland
Varda Rotter
WW Wasserman
X Xie
Y Barash
Y Benjamini
Y Tabach
Yossi Buganim
Yuval Tabach
Z Wang
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2007

We introduce a novel method to screen the promoters of a set of genes with shared biological function against a precompiled library of motifs, and find those motifs which are statistically over-represented in the gene set. The gene sets were obtained from the functional Gene Ontology (GO) classification; for each set and motif we optimized the sequence similarity score threshold independently for every location window (measured with respect to the transcription start site, TSS), taking into account the location-dependent nucleotide heterogeneity along the promoters of the target genes. We performed a high-throughput analysis, searching the promoters (from 200 bp downstream to 1000 bp upstream of the TSS) of more than 8000 human and 23,000 mouse genes, for 134 functional Gene Ontology classes and for 412 known DNA motifs. When combined with binding site and location conservation between human and mouse, the method identifies with high probability functional binding sites that regulate groups of biologically related genes. We found many location-sensitive functional binding events and showed that they clustered close to the TSS. Our method and findings were put to several experimental tests. By allowing a "flexible" threshold and combining our functional class and location-specific search method with conservation between human and mouse, we are able to reliably identify functional TF binding sites. This is an essential step towards constructing regulatory networks and elucidating the design principles that govern transcriptional regulation of expression. The promoter region proximal to the TSS appears to be of central importance for regulation of transcription in human and mouse, just as it is in bacteria and yeast. Comment: 31 pages, including Supplementary Information and figure
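The abstract above does not state which statistic was used to call a motif over-represented in a gene set. As a generic illustration only, the sketch below uses a one-sided hypergeometric test, a common choice for this kind of enrichment question. The function name `enrichment_p_value` and all counts are hypothetical.

```python
from math import comb


def enrichment_p_value(population, with_motif, set_size, hits):
    """P(X >= hits) for X ~ Hypergeometric(population, with_motif, set_size).

    population: total number of promoters screened
    with_motif: promoters in the population that contain the motif
    set_size:   promoters belonging to the gene set of interest
    hits:       promoters in the gene set that contain the motif
    """
    upper = min(with_motif, set_size)
    total = comb(population, set_size)
    tail = sum(comb(with_motif, k) * comb(population - with_motif, set_size - k)
               for k in range(hits, upper + 1))
    return tail / total


# Toy example: 10 promoters, 5 carry the motif, a gene set of 2, both carry it.
p_both = enrichment_p_value(10, 5, 2, 2)
# Observing at least zero hits is certain.
p_any = enrichment_p_value(10, 5, 2, 0)
```

In a screen over many gene sets, motifs and location windows, such p-values would also need a multiple-testing correction, which this sketch omits.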

CiteSeerX

Public Library of Science (PLOS)

Accelerated search for biomolecular network models to interpret high-throughput experimental data

Author: BA Sokhansanj
Bahrad A Sokhansanj
D Husmeier
D Repsilber
EP Gianchandani
J Gagneur
J Stelling
J Tegner
JH Holland
KC Chen
KW Kohn
L Glass
LA Soinov
M Arita
ME Csete
MKS Yeung
ML Whitfield
N Friedman
PJ Woolf
S Liang
Suman Datta
WE Combs
X Hu
XM Zhu
Publication venue: BioMed Central
Publication date: 01/07/2007

Abstract. Background: The functions of human cells are carried out by biomolecular networks, which include proteins, genes, and regulatory sites within DNA that encode and control protein expression. Models of biomolecular network structure and dynamics can be inferred from high-throughput measurements of gene and protein expression. We build on our previously developed fuzzy logic method for bridging quantitative and qualitative biological data to address the challenges of noisy, low-resolution high-throughput measurements, i.e., from gene expression microarrays. We employ an evolutionary search algorithm to accelerate the search for hypothetical fuzzy biomolecular network models consistent with a biological data set. We also develop a method to estimate the probability of a potential network model fitting a set of data by chance. The resulting metric provides an estimate of both model quality and dataset quality, identifying data that are too noisy to identify meaningful correlations between the measured variables. Results: Optimal parameters for the evolutionary search were identified based on artificial data, and the algorithm showed scalable and consistent performance for as many as 150 variables. The method was tested on previously published human cell cycle gene expression microarray data sets. The evolutionary search method was found to converge to the results of exhaustive search. The randomized evolutionary search was able to converge on a set of similar best-fitting network models on different training data sets after 30 generations running 30 models per generation. Consistent results were found regardless of which of the published data sets was used to train or verify the quantitative predictions of the best-fitting models for cell cycle gene dynamics. Conclusion: Our results demonstrate the capability of scalable evolutionary search for fuzzy network models to address the problem of inferring models based on complex, noisy biomolecular data sets. This approach yields multiple alternative models that are consistent with the data, yielding a constrained set of hypotheses that can be used to optimally design subsequent experiments.

Drexel Libraries E-Repository and Archives

Gene expression model (in)validation by Fourier analysis

Author: A Goldbeter
B Ghosh
B Novak
EF Glynn
H Shankaran
I Chou
J Chen
J Stricker
JJ Tyson
K Shedden
M Ahdesmäki
M Tigges
MA Lema
Marianne Rooman
ME Hughes
ML Whitfield
N Friedman
P Smolen
PT Spellman
RM Gray
RN Bracewall
S Wichert
Tomasz Konopka
Y Mileyko
Z Bar-Joseph
Publication venue: BioMed Central
Publication date: 01/01/2010

The determination of the right model structure describing a gene regulation network and the identification of its parameters are major goals in systems biology. The task is often hampered by the lack of relevant experimental data with sufficiently low noise level, but the subset of genes whose concentration levels exhibit an oscillatory behavior in time can readily be analyzed on the basis of their Fourier spectrum, known to turn complex signals into few relatively noise-free parameters. Such genes therefore offer opportunities of understanding gene regulation quantitatively.

Springer - Publisher Connector

Public Library of Science (PLOS)

DI-fusion

A Signature Inferred from Drosophila Mitotic Genes Predicts Survival of Breast Cancer Patients

Author: A Dupuy
A Shimo
AH Bild
Antonio Lembo
AP Fields
C Desmedt
C Sotiriou
C Suzuki
Christian Damasco
CL Wilson
D Coppola
EW Sayers
F Reyal
Ferdinando Di Cunto
FL Sung
H Zhao
HS Phillips
HY Chang
I Ben-Porath
I Skaland
JP Baak
JT Chi
K Shedden
K Tamura
LD Miller
LJ van't Veer
M Skrzypski
Marc Vooijs
Maria Patrizia Somma
Maurizio Gatti
MH Starmans
MJ van de Vijver
ML Whitfield
MP Somma
P Wirapati
Paolo Provero
PJ van Diest
R Liu
R Pellegrino
RC Gentleman
S Horvath
S Shimizu
SC Winter
SL Carter
SY Lin
TL Schmit
US Eggert
WA Freije
Y Galanty
Y Pawitan
Y Wang
Publication venue: Public Library of Science
Publication date: 01/01/2011

Introduction: The classification of breast cancer patients into risk groups provides a powerful tool for the identification of patients who will benefit from aggressive systemic therapy. The analysis of microarray data has generated several gene expression signatures that improve diagnosis and allow risk assessment. There is also evidence that cell proliferation-related genes have a high predictive power within these signatures. Methods: We thus constructed a gene expression signature (the DM signature) using the human orthologues of 108 Drosophila melanogaster genes required for either the maintenance of chromosome integrity (36 genes) or mitotic division (72 genes). Results: The DM signature has minimal overlap with the extant signatures and is highly predictive of survival in 5 large breast cancer datasets. In addition, we show that the DM signature outperforms many widely used breast cancer signatures in predictive power, and performs comparably to other proliferation-based signatures. For most genes of the DM signature, an increased expression is negatively correlated with patient survival. The genes that provide the highest contribution to the predictive power of the DM signature are those involved in cytokinesis. Conclusion: This finding highlights cytokinesis as an important marker in breast cancer prognosis and as a possible targe

Archivio della ricerca- Università di Roma La Sapienza

Integration of decision support systems to improve decision support performance

Author: A Kaklauskas
A Kusiak
AC Marquez
AHB Duffy
Alex H. B. Duffy
B Chae
B Lopez
C Carlsson
C Silva
CD Evans
D Lam
D Mladenic
D Riecken
D Thapa
D Zhang
DA Guerra-Zubiaga
DR Dolk
DS Linthicum
E Claver
E Thomsen
EJM Lauria
F Kebair
FD Turck
G DeSanctis
G Niu
GD Bhatt
GM Carter
H Lan
HA Simon
HY Lin
I Bose
I Boyle
I Thomas
I Truck
Iain M. Boyle
IH Witten
IK Bindoff
J Kolodner
J Zeleznikow
JE Nelson
JF Courtney
JH Lee
JJ Elam
JO Grady
JP Costa
JP Shim
K Eisenhardt
K Kristensen
K Pal
KQ Byung
KW Lee
L Ding
L Ekenberg
L Lin
LA Kurgan
M Alvarado
M Beynon
M Bradford
M Cohen
M Frize
M Harrison
M Limayem
M Wang
MJ Huang
MJ Shaw
ML Markus
MN Huhns
N Bolloju
NR Jennings
O Kwon
P Keen
PA Rodgers
PC Nutt
QF Ni
R Anderson
R Anson
R Bellazzi
R Chalmeta
R Denzer
R Kimball
R Orwig
R Vahidov
RE Giachetti
RH Rao
Robert Ian Whitfield
RP Baker
RW Blanning
S Daskalaki
S Liu
S Szykman
SA Raghavan
SB Eom
SD Pinson
Shaofeng Liu
T Bui
TH Davenport
TJ Hess
TP Gerrity
WA Muhanna
WD Li
Y Reich
Y Zhu
YC Tsai
Z Shi
Publication venue: Springer Science and Business Media LLC
Publication date: 01/03/2010

Decision support systems (DSS) are a well-established research and development area. Traditional isolated, stand-alone DSS have recently been facing new challenges. To improve the performance of DSS and meet these challenges, research has been actively carried out to develop integrated decision support systems (IDSS). This paper reviews the current research efforts with regard to the development of IDSS. The focus of the paper is on the integration aspect for IDSS through multiple perspectives, and the technologies that support this integration. More than 100 papers and software systems are discussed. Current research efforts and the development status of IDSS are explained, compared and classified. In addition, future trends and challenges in integration are outlined. The paper concludes that by addressing integration, better support will be provided to decision makers, with the expectation of both better decisions and improved decision making processes.

University of Strathclyde Institutional Repository

Mutations of PIK3CA in gastric adenocarcinoma

Author: Agnes Sze Wah Chan
Chi Wai Wong
DS Byun
G Guanti
H Davies
H Rajagopalan
H Suzuki
I Vivanco
J Woenckhaus
JR Testa
Kent-Man Chu
L Shayesteh
M Perucho
M Sun
ML Sulis
ML Whitfield
N Itoh
O Troyanskaya
P Peltomaki
R Katso
S Malkhosyan
Samuel So
Siu Tsan Yuen
SP Staal
ST Yuen
Suet Yi Leung
SY Leung
SY Nam
Tsun Leung Chan
VG Tusher
Vivian Sze Wing Li
W Zhao
Wei Zhao
X Chen
Xin Chen
XP Zhou
Y Samuels
YY Ma
Publication venue: BioMed Central
Publication date: 01/01/2005

BACKGROUND: Activation of the phosphatidylinositol 3-kinase (PI3K) through mutational inactivation of the PTEN tumour suppressor gene is common in diverse cancer types, but rarely reported in gastric cancer. Recently, mutations in PIK3CA, which encodes the p110α catalytic subunit of PI3K, have been identified in various human cancers, including 3 of 12 gastric cancers. Eighty percent of these reported mutations clustered within 2 regions involving the helical and kinase domains. An in vitro study of one of the "hot-spot" mutants has demonstrated that it is an activating mutation. METHODS: Based on these data, we initiated PIK3CA mutation screening in 94 human gastric cancers by direct sequencing of the gene regions in which 80% of all the known PIK3CA mutations were found. We also examined PIK3CA expression level by extracting data from the previous large-scale gene expression profiling study. Using Significance Analysis of Microarrays (SAM), we further searched for genes that show correlating expression with PIK3CA. RESULTS: We have identified PIK3CA mutations in 4 cases (4.3%), all involving the previously reported hotspots. Among these 4 cases, 3 tumours demonstrated microsatellite instability and 2 tumours harboured concurrent KRAS mutation. Data extracted from microarray studies showed an increased expression of PIK3CA in gastric cancers when compared with the non-neoplastic gastric mucosae (p < 0.001). SAM further identified 2910 genes whose expression levels were positively associated with that of PIK3CA. CONCLUSION: Our data suggested that activation of the PI3K signalling pathway in gastric cancer may be achieved through up-regulation or mutation of PIK3CA, in which the latter may be a consequence of mismatch repair deficiency.

Springer - Publisher Connector